Goto

Collaborating Authors

 Durham Region


Handoff Design in User-Centric Cell-Free Massive MIMO Networks Using DRL

Ammar, Hussein A., Adve, Raviraj, Shahbazpanahi, Shahram, Boudreau, Gary, Bahceci, Israfil

arXiv.org Artificial Intelligence

--In the user-centric cell-free massive MIMO (UC-mMIMO) network scheme, user mobility necessitates updating the set of serving access points to maintain the user-centric clustering. Such updates are typically performed through handoff (HO) operations; however, frequent HOs lead to overheads associated with the allocation and release of resources. This paper presents a deep reinforcement learning (DRL)-based solution to predict and manage these connections for mobile users. Our solution employs the Soft Actor-Critic algorithm, with continuous action space representation, to train a deep neural network to serve as the HO policy. We present a novel proposition for a reward function that integrates a HO penalty in order to balance the attainable rate and the associated overhead related to HOs. We develop two variants of our system; the first one uses mobility direction-assisted (DA) observations that are based on the user movement pattern, while the second one uses history-assisted (HA) observations that are based on the history of the large-scale fading (LSF). Simulation results show that our DRL-based continuous action space approach is more scalable than discrete space counterpart, and that our derived HO policy automatically learns to gather HOs in specific time slots to minimize the overhead of initiating HOs. Our solution can also operate in real time with a response time less than 0 . Index T erms --Mobility, handoff, handover, user-centric, cell-free massive MIMO, distributed MIMO, deep-reinforcement learning, soft actor critic, machine learning, channel aging. User-centric cell-free massive MIMO (UC-mMIMO) is a wireless network architecture where each user is served by a custom group of neighboring access points (APs) which are connected to a central unit (CU) via fronthaul links [1]. Unlike the current cellular system that is based on macro base stations, UC-mMIMO deploys cooperative APs that jointly serve users without relying on a traditional cellular boundaries. UC-mMIMO helps to achieve reliable wireless connectivity and provides uniform performance throughout the network [1], [2]. However, this beyond-5G mobile wireless network architecture introduces the key challenge of determining the connections between the APs and the users when moving through the network [3].


Data-Driven Dimensional Synthesis of Diverse Planar Four-bar Function Generation Mechanisms via Direct Parameterization

Kim, Woon Ryong, Jung, Jaeheun, Ha, Jeong Un, Lee, Donghun, Shim, Jae Kyung

arXiv.org Artificial Intelligence

Planar four-bar mechanisms are widely used in mechanical systems due to their simplicity and versatility. In designing a mechanism to achieve a desired task, accurately calculating its dimensions--a process known as dimensional synthesis--is essential. However, even for four-bar mechanisms, this synthesis presents considerable challenges. Unlike kinematic analysis, which determines output motion from given dimensions, dimensional synthesis is an inverse problem: given a desired output motion, typically expressed as precision points, the objective is to determine the corresponding mechanism dimensions. Extensive research has been conducted on dimensional synthesis since Freudenstein [1] introduced his foundational analytical approach for four-bar mechanisms. Contemporary studies in this field follow two major approaches: exact synthesis, also known as the precision point approach, which aims to find mechanism dimensions that satisfy desired characteristics exactly only at a finite number of discrete precision points, and the approximate approach, which focuses on obtaining solutions that minimize structural error over the entire range of motion. In this study, a novel data-driven approach is proposed to solve the dimensional synthesis problem of multi-type four-bar function generation mechanisms, leveraging machine learning to bypass the need to solve complex systems of equations and conduct optimization tasks. A supervised learning framework consisting of three key components is proposed: 1) a large synthetic dataset, 2) a deep neural network model, and 3) effective training methods.


Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation

Thillainathan, Sarubi, Yuan, Songchen, Lee, En-Shiun Annie, Jayasena, Sanath, Ranathunga, Surangika

arXiv.org Artificial Intelligence

Fine-tuning multilingual sequence-to-sequence large language models (msLLMs) has shown promise in developing neural machine translation (NMT) systems for low-resource languages (LRLs). However, conventional single-stage fine-tuning methods struggle in extremely low-resource NMT settings, where training data is very limited. This paper contributes to artificial intelligence by proposing two approaches for adapting msLLMs in these challenging scenarios: (1) continual pre-training (CPT), where the msLLM is further trained with domain-specific monolingual data to compensate for the under-representation of LRLs, and (2) intermediate task transfer learning (ITTL), a method that fine-tunes the msLLM with both in-domain and out-of-domain parallel data to enhance its translation capabilities across various domains and tasks. As an application in engineering, these methods are implemented in NMT systems for Sinhala, Tamil, and English (six language pairs) in domain-specific, extremely low-resource settings (datasets containing fewer than 100,000 samples). Our experiments reveal that these approaches enhance translation performance by an average of +1.47 bilingual evaluation understudy (BLEU) score compared to the standard single-stage fine-tuning baseline across all translation directions. Additionally, a multi-model ensemble further improves performance by an additional BLEU score.


Self-Reported Confidence of Large Language Models in Gastroenterology: Analysis of Commercial, Open-Source, and Quantized Models

Naderi, Nariman, Safavi-Naini, Seyed Amir Ahmad, Savage, Thomas, Atf, Zahra, Lewis, Peter, Nadkarni, Girish, Soroush, Ali

arXiv.org Artificial Intelligence

This study evaluated self-reported response certainty across several large language models (GPT, Claude, Llama, Phi, Mistral, Gemini, Gemma, and Qwen) using 300 gastroenterology board-style questions. The highest-performing models (GPT-o1 preview, GPT-4o, and Claude-3.5-Sonnet) achieved Brier scores of 0.15-0.2 and AUROC of 0.6. Although newer models demonstrated improved performance, all exhibited a consistent tendency towards overconfidence. Uncertainty estimation presents a significant challenge to the safe use of LLMs in healthcare. Keywords: Large Language Models; Confidence Elicitation; Artificial Intelligence; Gastroenterology; Uncertainty Quantification


Agentic Search Engine for Real-Time IoT Data

Elewah, Abdelrahman, Elgazzar, Khalid

arXiv.org Artificial Intelligence

The Internet of Things (IoT) has enabled diverse devices to communicate over the Internet, yet the fragmentation of IoT systems limits seamless data sharing and coordinated management. We have recently introduced SensorsConnect, a unified framework to enable seamless content and sensor data sharing in collaborative IoT systems, inspired by how the World Wide Web (WWW) enabled a shared and accessible space for information among humans. This paper presents the IoT Agentic Search Engine (IoT-ASE), a real-time search engine tailored for IoT environments. IoT-ASE leverages Large Language Models (LLMs) and Retrieval Augmented Generation (RAG) techniques to address the challenge of searching vast, real-time IoT data, enabling it to handle complex queries and deliver accurate, contextually relevant results. We implemented a use-case scenario in Toronto to demonstrate how IoT-ASE can improve service quality recommendations by leveraging real-time IoT data. Our evaluation shows that IoT-ASE achieves a 92\% accuracy in retrieving intent-based services and produces responses that are concise, relevant, and context-aware, outperforming generalized responses from systems like Gemini. These findings highlight the potential IoT-ASE to make real-time IoT data accessible and support effective, real-time decision-making.


CRUPL: A Semi-Supervised Cyber Attack Detection with Consistency Regularization and Uncertainty-aware Pseudo-Labeling in Smart Grid

Dash, Smruti P., Khandeparkar, Kedar V., Agrawal, Nipun

arXiv.org Artificial Intelligence

The modern power grids are integrated with digital technologies and automation systems. The inclusion of digital technologies has made the smart grids vulnerable to cyber-attacks. Cyberattacks on smart grids can compromise data integrity and jeopardize the reliability of the power supply. Traditional intrusion detection systems often need help to effectively detect novel and sophisticated attacks due to their reliance on labeled training data, which may only encompass part of the spectrum of potential threats. This work proposes a semi-supervised method for cyber-attack detection in smart grids by leveraging the labeled and unlabeled measurement data. We implement consistency regularization and pseudo-labeling to identify deviations from expected behavior and predict the attack classes. We use a curriculum learning approach to improve pseudo-labeling performance, capturing the model uncertainty. We demonstrate the efficiency of the proposed method in detecting different types of cyberattacks, minimizing the false positives by implementing them on publicly available datasets. The method proposes a promising solution by improving the detection accuracy to 99% in the presence of unknown samples and significantly reducing false positives.


Enabling AutoML for Zero-Touch Network Security: Use-Case Driven Analysis

Yang, Li, Rajab, Mirna El, Shami, Abdallah, Muhaidat, Sami

arXiv.org Artificial Intelligence

Zero-Touch Networks (ZTNs) represent a state-of-the-art paradigm shift towards fully automated and intelligent network management, enabling the automation and intelligence required to manage the complexity, scale, and dynamic nature of next-generation (6G) networks. ZTNs leverage Artificial Intelligence (AI) and Machine Learning (ML) to enhance operational efficiency, support intelligent decision-making, and ensure effective resource allocation. However, the implementation of ZTNs is subject to security challenges that need to be resolved to achieve their full potential. In particular, two critical challenges arise: the need for human expertise in developing AI/ML-based security mechanisms, and the threat of adversarial attacks targeting AI/ML models. In this survey paper, we provide a comprehensive review of current security issues in ZTNs, emphasizing the need for advanced AI/ML-based security mechanisms that require minimal human intervention and protect AI/ML models themselves. Furthermore, we explore the potential of Automated ML (AutoML) technologies in developing robust security solutions for ZTNs. Through case studies, we illustrate practical approaches to securing ZTNs against both conventional and AI/ML-specific threats, including the development of autonomous intrusion detection systems and strategies to combat Adversarial ML (AML) attacks. The paper concludes with a discussion of the future research directions for the development of ZTN security approaches.


Towards Zero Touch Networks: Cross-Layer Automated Security Solutions for 6G Wireless Networks

Yang, Li, Naser, Shimaa, Shami, Abdallah, Muhaidat, Sami, Ong, Lyndon, Debbah, Mérouane

arXiv.org Artificial Intelligence

The transition from 5G to 6G mobile networks necessitates network automation to meet the escalating demands for high data rates, ultra-low latency, and integrated technology. Recently, Zero-Touch Networks (ZTNs), driven by Artificial Intelligence (AI) and Machine Learning (ML), are designed to automate the entire lifecycle of network operations with minimal human intervention, presenting a promising solution for enhancing automation in 5G/6G networks. However, the implementation of ZTNs brings forth the need for autonomous and robust cybersecurity solutions, as ZTNs rely heavily on automation. AI/ML algorithms are widely used to develop cybersecurity mechanisms, but require substantial specialized expertise and encounter model drift issues, posing significant challenges in developing autonomous cybersecurity measures. Therefore, this paper proposes an automated security framework targeting Physical Layer Authentication (PLA) and Cross-Layer Intrusion Detection Systems (CLIDS) to address security concerns at multiple Internet protocol layers. The proposed framework employs drift-adaptive online learning techniques and a novel enhanced Successive Halving (SH)-based Automated ML (AutoML) method to automatically generate optimized ML models for dynamic networking environments. Experimental results illustrate that the proposed framework achieves high performance on the public Radio Frequency (RF) fingerprinting and the Canadian Institute for CICIDS2017 datasets, showcasing its effectiveness in addressing PLA and CLIDS tasks within dynamic and complex networking environments. Furthermore, the paper explores open challenges and research directions in the 5G/6G cybersecurity domain. This framework represents a significant advancement towards fully autonomous and secure 6G networks, paving the way for future innovations in network automation and cybersecurity.


Spatiotemporal Graph Neural Networks in short term load forecasting: Does adding Graph Structure in Consumption Data Improve Predictions?

Nguyen, Quoc Viet, Fernandez, Joaquin Delgado, Menci, Sergio Potenciano

arXiv.org Artificial Intelligence

Short term Load Forecasting (STLF) plays an important role in traditional and modern power systems. Most STLF models predominantly exploit temporal dependencies from historical data to predict future consumption. Nowadays, with the widespread deployment of smart meters, their data can contain spatiotemporal dependencies. In particular, their consumption data is not only correlated to historical values but also to the values of neighboring smart meters. This new characteristic motivates researchers to explore and experiment with new models that can effectively integrate spatiotemporal interrelations to increase forecasting performance. Spatiotemporal Graph Neural Networks (STGNNs) can leverage such interrelations by modeling relationships between smart meters as a graph and using these relationships as additional features to predict future energy consumption. While extensively studied in other spatiotemporal forecasting domains such as traffic, environments, or renewable energy generation, their application to load forecasting remains relatively unexplored, particularly in scenarios where the graph structure is not inherently available. This paper overviews the current literature focusing on STGNNs with application in STLF. Additionally, from a technical perspective, it also benchmarks selected STGNN models for STLF at the residential and aggregate levels. The results indicate that incorporating graph features can improve forecasting accuracy at the residential level; however, this effect is not reflected at the aggregate level


An Analysis of LLM Fine-Tuning and Few-Shot Learning for Flaky Test Detection and Classification

More, Riddhi, Bradbury, Jeremy S.

arXiv.org Artificial Intelligence

Flaky tests exhibit non-deterministic behavior during execution and they may pass or fail without any changes to the program under test. Detecting and classifying these flaky tests is crucial for maintaining the robustness of automated test suites and ensuring the overall reliability and confidence in the testing. However, flaky test detection and classification is challenging due to the variability in test behavior, which can depend on environmental conditions and subtle code interactions. Large Language Models (LLMs) offer promising approaches to address this challenge, with fine-tuning and few-shot learning (FSL) emerging as viable techniques. With enough data fine-tuning a pre-trained LLM can achieve high accuracy, making it suitable for organizations with more resources. Alternatively, we introduce FlakyXbert, an FSL approach that employs a Siamese network architecture to train efficiently with limited data. To understand the performance and cost differences between these two methods, we compare fine-tuning on larger datasets with FSL in scenarios restricted by smaller datasets. Our evaluation involves two existing flaky test datasets, FlakyCat and IDoFT. Our results suggest that while fine-tuning can achieve high accuracy, FSL provides a cost-effective approach with competitive accuracy, which is especially beneficial for organizations or projects with limited historical data available for training. These findings underscore the viability of both fine-tuning and FSL in flaky test detection and classification with each suited to different organizational needs and resource availability.